MGMT 675
AI-Assisted Financial Analysis

Classification
Categorical target variables
- Binary (off/on, yes/no, …)
- Multiclass
- Same sets of models: linear, trees, neural nets, …
Binary Examples
- Random forest
- Gradient boosting
- Linear (logistic regression)
Binary Data
- Upload irrelevant_features.xlsx to Julius
- Ask Julius to read it
- Tell Julius that y2 is the target variable and x1 through x50 are the features
- y2 is a binary “high-low” version of y1, where y1 = x1 + noise, so x1 is the only relevant feature.
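To make the structure of the data concrete, here is a minimal sketch that simulates a dataset of the same shape. It is not the course file. The sample size, the standard normal distributions, and the median cutoff that defines "high" versus "low" are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)          # fixed seed for reproducibility
n_obs, n_features = 1000, 50            # sample size is an assumption

# Features x1..x50: independent standard normals (assumed distribution)
X = pd.DataFrame(
    rng.standard_normal((n_obs, n_features)),
    columns=[f"x{i}" for i in range(1, n_features + 1)],
)

# y1 depends only on x1; the other 49 features are irrelevant
y1 = X["x1"] + rng.standard_normal(n_obs)

# y2 is 1 ("high") when y1 is above its median, else 0 ("low").
# The median cutoff is an assumption, not taken from the course data.
y2 = (y1 > y1.median()).astype(int)

print(y2.value_counts())
```

With the actual spreadsheet, `pd.read_excel("irrelevant_features.xlsx")` would take the place of the simulation step.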
Random forest
- Ask Julius to do a train-test split and train a random forest on the training data.
- Ask Julius to produce a confusion matrix for the training data and a confusion matrix for the test data.
- Ask Julius to produce a ROC curve for the test data and to explain it.
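The three requests above correspond roughly to the following scikit-learn sketch. Julius may generate different code. The synthetic data, the 80/20 split, and the forest settings are assumptions chosen only so that the example runs on its own.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the spreadsheet (assumed, not the course data)
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 50))
y1 = X[:, 0] + rng.standard_normal(1000)
y2 = (y1 > np.median(y1)).astype(int)

# Hold out 20% of the rows for testing (split size is an assumption)
X_train, X_test, y_train, y_test = train_test_split(
    X, y2, test_size=0.2, random_state=0, stratify=y2
)

# Fit the forest on the training rows only
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X_train, y_train)

# Confusion matrices: rows are true classes, columns are predicted classes
cm_train = confusion_matrix(y_train, forest.predict(X_train))
cm_test = confusion_matrix(y_test, forest.predict(X_test))
print("Train confusion matrix:\n", cm_train)
print("Test confusion matrix:\n", cm_test)

# ROC curve on the test rows, built from predicted probabilities of class 1
prob_test = forest.predict_proba(X_test)[:, 1]
fpr, tpr, thresholds = roc_curve(y_test, prob_test)
auc_test = roc_auc_score(y_test, prob_test)
print(f"Test AUC: {auc_test:.3f}")
```

Plotting `tpr` against `fpr` gives the ROC curve. A forest of fully grown trees usually classifies its own training rows almost perfectly, so the test confusion matrix is the more honest measure of performance.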
Linear model (logistic regression)
- Designed for binary targets but can be extended to multiclass targets
- Code the binary target as a 0/1 dummy variable
- Choose parameters \(\alpha\), \(\beta_i\) to maximize the fit (typically the likelihood) of \[ \frac{1}{1+e^{-\alpha - \beta_1 x_1 - \cdots - \beta_n x_n}}\] to the dummy variable.
- Can apply shrinkage (penalize large coefficients, as in ridge or lasso)
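As a rough illustration of the points above, the sketch below fits a logistic regression with weak and strong shrinkage and reproduces the fitted probability by hand from \(\alpha\) and \(\beta_i\). The synthetic data and the two values of `C` are assumptions. In scikit-learn, `C` is the inverse of the penalty strength, so a smaller `C` means more shrinkage, and the default penalty is L2 (ridge-style).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the spreadsheet (assumed, not the course data)
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 50))
y1 = X[:, 0] + rng.standard_normal(1000)
y2 = (y1 > np.median(y1)).astype(int)

# Smaller C means stronger L2 shrinkage of the coefficients
weak = LogisticRegression(C=100.0, max_iter=1000).fit(X, y2)
strong = LogisticRegression(C=0.01, max_iter=1000).fit(X, y2)


def logistic(z):
    """Logistic function 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))


# Rebuild the predicted probability from the fitted alpha and betas
alpha, beta = strong.intercept_[0], strong.coef_[0]
manual_prob = logistic(alpha + X @ beta)
sklearn_prob = strong.predict_proba(X)[:, 1]

print("Manual formula matches predict_proba:",
      np.allclose(manual_prob, sklearn_prob))
print("Coefficient norm, weak shrinkage:  ", np.linalg.norm(weak.coef_))
print("Coefficient norm, strong shrinkage:", np.linalg.norm(strong.coef_))
```

Stronger shrinkage pulls all coefficients toward zero, which mostly dampens the 49 irrelevant features while x1 keeps the largest coefficient in absolute value.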